Detecting Search Engine Spam from a Trackback Network in Blogspace
Abstract
We aim to develop a technique for detecting search engine optimization (SEO) spam websites. Specifically, we propose four methods, each based on a fundamental network metric, for extracting SEO spam entries from a given trackback network in blogspace. Using real trackback network data from blogspace, we experimentally evaluate the performance of the proposed methods and demonstrate that ranking entries by the average degree of their nearest neighbors is a very promising approach for extracting SEO spam entries.
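The ranking idea named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definition, so the following are assumptions made only for illustration: the trackback network is treated as undirected, entries are ranked in descending order of the metric, and the names `average_neighbor_degree`, `rank_entries`, and the toy edge list are hypothetical.

```python
from collections import defaultdict


def average_neighbor_degree(edges):
    """Return {entry: mean degree of its neighbors} for an undirected graph.

    Assumption: trackback links are treated as undirected edges.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-links
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = {node: len(adj) for node, adj in neighbors.items()}
    return {
        node: sum(degree[m] for m in adj) / len(adj)
        for node, adj in neighbors.items()
    }


def rank_entries(edges):
    """Rank entries by average neighbor degree, highest first.

    Assumption: descending order; ties are broken by entry name.
    """
    knn = average_neighbor_degree(edges)
    return sorted(knn, key=lambda node: (-knn[node], node))


# Hypothetical toy network: s1, s2, s3 link to each other; a and b hang off s1.
edges = [("s1", "s2"), ("s1", "s3"), ("s2", "s3"), ("s1", "a"), ("a", "b")]
knn = average_neighbor_degree(edges)
ranking = rank_entries(edges)
```

Whether high or low values of this metric indicate spam is not stated in the abstract, so any threshold or cut-off applied to the ranking would need to come from the paper's experiments.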
Related resources
A Novel Approach for Combating Spamdexing in Web using UCINET and SVM Light Tool
Search Engine spam is a web page or a portion of a web page which has been created with the intention of increasing its ranking in search engines. Web spamming refers to actions intended to mislead search engines and give some pages higher ranking than they deserve. Anyone who uses a search engine frequently has most likely encountered a high ranking page that consists of nothing more than a bu...
Detecting Stealth Web Pages That Use Click-Through Cloaking
Search spam is an attack on search engines’ ranking algorithms to promote spam links into top search ranking that they do not deserve. Cloaking is a well-known search spam technique in which spammers serve one page to search-engine crawlers to optimize ranking, but serve a different page to browser users to maximize potential profit. In this experience report, we investigate a different and rela...
Using Semantic Analysis to Classify Search Engine Spam
Search engines have tried many techniques to filter out these spam pages before they can appear on the query results page. In Section 2 we present a collection of current methods that are being used to combat spam. We introduce a new approach to spam detection in Section 3 that uses semantic analysis of textual content as a means of detecting spam. This new approach uses a series of content ana...
A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors
Because cyberspace and Internet predominate in the life of users, in addition to business opportunities and time reductions, threats like information theft, penetration into systems, etc. are included in the field of hardware and software. Security is the top priority to prevent a cyber-attack that users should initially be detecting the type of attacks because virtual environments are not moni...
Improving Web Spam Classifiers Using Link Structure (S)
Web spam has been recognized as one of the top challenges in the search engine industry [14]. A lot of recent work has addressed the problem of detecting or demoting web spam, including both content spam [16, 12] and link spam [22, 13]. However, any time an anti-spam technique is developed, spammers will design new spamming techniques to confuse search engine ranking methods and spam detection ...
Publication date: 2005